How People Use ChatGPT

2025-09-17
Published: 2025-09-15
Authors: Aaron Chatterji, Thomas Cunningham, David J. Deming, Zoe Hitzig, Christopher Ong, Carl Yan Shan, Kevin Wadman

Abstract

Despite the rapid adoption of LLM chatbots, little is known about how they are used. We document the growth of ChatGPT’s consumer product from its launch in November 2022 through July 2025, when it had been adopted by around 10% of the world’s adult population. Early adopters were disproportionately male but the gender gap has narrowed dramatically, and we find higher growth rates in lower-income countries. Using a privacy-preserving automated pipeline, we classify usage patterns within a representative sample of ChatGPT conversations. We find steady growth in work-related messages but even faster growth in non-work-related messages, which have grown from 53% to more than 70% of all usage. Work usage is more common for educated users in highly-paid professional occupations. We classify messages by conversation topic and find that “Practical Guidance,” “Seeking Information,” and “Writing” are the three most common topics and collectively account for nearly 80% of all conversations. Writing dominates work-related tasks, highlighting chatbots’ unique ability to generate digital outputs compared to traditional search engines. Computer programming and self-expression both represent relatively small shares of use. Overall, we find that ChatGPT provides economic value through decision support, which is especially important in knowledge-intensive jobs.

3. Data and Privacy

The privacy-preserving classification pipeline:

Messages are categorized according to 5 different LLM-based classifiers. The classifiers are introduced in more detail in Section 5, their exact text is reproduced in Appendix A, and our validation procedure is described in Appendix B. …

The messages are … then classified according to classifiers defined over a controlled label space—the most precise classifier we use on the message-level data set is the O*NET Intermediate Work Activities taxonomy, which we augment to end up with 333 categories.
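To make the idea of a "controlled label space" concrete, here is a minimal sketch of how a classifier's output can be constrained to a fixed set of labels. It is not the paper's pipeline: the label set, the `classify_controlled` function, and the `stub_model` stand-in for an LLM call are all hypothetical, chosen only for illustration. The paper's actual classifier prompts are in its Appendix A.

```python
from typing import Callable, FrozenSet

# Hypothetical label set for illustration only; not one of the paper's taxonomies.
ALLOWED_LABELS: FrozenSet[str] = frozenset({"work", "non_work", "unclear"})
FALLBACK_LABEL = "unclear"


def classify_controlled(
    message: str,
    model: Callable[[str], str],
    allowed: FrozenSet[str] = ALLOWED_LABELS,
    fallback: str = FALLBACK_LABEL,
) -> str:
    """Return a label from `allowed`; never pass through free text from the model."""
    raw = model(message)
    label = raw.strip().lower()
    # Anything outside the controlled vocabulary collapses to the fallback label,
    # so downstream analysis only ever sees values from the fixed label set.
    return label if label in allowed else fallback


def stub_model(message: str) -> str:
    """Hypothetical stand-in for an LLM call, so the sketch runs offline."""
    return "Work" if "report" in message.lower() else "something unexpected"


if __name__ == "__main__":
    print(classify_controlled("Draft my quarterly report", stub_model))
    print(classify_controlled("hello", stub_model))
```

The point of the design is that the analyst only ever handles labels from a known set, not message text or free-form model output.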

Summary:

To summarize, the key elements of our approach are:

Automated classification of messages. In the course of analysis, no one ever looked directly at the content of user messages: all of our analysis of the content of user messages is done through output of automated classifiers run on de-identified and PII-scrubbed usage data.

Aggregated employment data via a data clean room. We analyze and report aggregated employment data through a secure data clean room environment: no one on the research team had direct access to user-level demographic data and none of our analyses report aggregates for groups with less than 100 users.
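The minimum group size rule in the second element can be sketched as follows. This is an assumption-laden illustration, not the authors' clean-room code: the record layout `(user_id, group, value)` and the function name `aggregate_with_suppression` are hypothetical. Only the threshold of 100 users comes from the quoted passage.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple

MIN_GROUP_SIZE = 100  # threshold stated in the quoted passage

Record = Tuple[Hashable, str, float]  # hypothetical layout: (user_id, group, value)


def aggregate_with_suppression(
    records: Iterable[Record], min_users: int = MIN_GROUP_SIZE
) -> Dict[str, float]:
    """Return the mean value per group, omitting groups with fewer than `min_users` distinct users."""
    users = defaultdict(set)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user_id, group, value in records:
        users[group].add(user_id)  # count distinct users, not rows
        totals[group] += value
        counts[group] += 1
    return {
        group: totals[group] / counts[group]
        for group in users
        if len(users[group]) >= min_users
    }


if __name__ == "__main__":
    sample = [(i, "a", 1.0) for i in range(100)] + [(i, "b", 0.0) for i in range(99)]
    print(aggregate_with_suppression(sample))  # group "b" has 99 users and is omitted
```

Counting distinct users rather than rows matters here: a group with many messages from a handful of users would still be suppressed.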

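Notes

What drew me to this paper, more than anything else, was the “privacy-preserving automated pipeline.” Having an LLM read the messages instead of a person, and having it classify them within a predefined set of categories (a controlled vocabulary), is almost identical to an approach I devised for a project not long ago. - ak, 2025-09-17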